AITopics | input similarity

2de5d16682c3c35007e4e92982f1a2ba-Supplemental.pdf

Neural Information Processing Systems | Apr-25-2026, 07:43:08 GMT

artificial intelligence, dataset, machine learning, (17 more...)

Neural Information Processing Systems

Genre: Research Report (0.46)

Industry: Health & Medicine (0.70)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.71)

On UMAP's True Loss Function

Neural Information Processing Systems | Apr-25-2026, 07:43:04 GMT

UMAP has supplanted t-SNE as state-of-the-art for visualizing high-dimensional datasets in many disciplines, but the reason for its success is not well understood. In this work, we investigate UMAP's sampling-based optimization scheme in detail. We derive UMAP's true loss function in closed form and find that it differs from the published one in a dataset-size-dependent way. As a consequence, we show that UMAP does not aim to reproduce its theoretically motivated high-dimensional UMAP similarities. Instead, it tries to reproduce similarities that only encode the k-nearest-neighbor graph, thereby challenging the previous understanding of UMAP's effectiveness. Alternatively, we consider the implicit balancing of attraction and repulsion due to the negative sampling to be key to UMAP's success. We corroborate our theoretical findings on toy and single-cell RNA sequencing data.

artificial intelligence, machine learning, similarity, (16 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

2de5d16682c3c35007e4e92982f1a2ba-Supplemental.pdf

Neural Information Processing Systems | Feb-8-2026, 01:46:40 GMT

While the toy and scRNA-seq datasets do not have a train/test split, we used the training and test sets of CIFAR-10 jointly for the unsupervised UMAP dimension reduction.

artificial intelligence, inductive learning, machine learning, (20 more...)

Neural Information Processing Systems

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.34)

2de5d16682c3c35007e4e92982f1a2ba-Paper.pdf

Neural Information Processing Systems | Feb-8-2026, 01:46:36 GMT

high-dimensional similarity, similarity, umap, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Germany > Baden-Württemberg > Karlsruhe Region > Heidelberg (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Input Similarity from the Neural Network Perspective

Neural Information Processing Systems | Dec-25-2025, 23:16:42 GMT

Given a trained neural network, we aim at understanding how similar it considers any two samples. For this, we express a proper definition of similarity from the neural network perspective (i.e.

input similarity, name change, neural network perspective, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.60)

Reviews: Input Similarity from the Neural Network Perspective

Neural Information Processing Systems | Jan-27-2025, 00:11:18 GMT

I don't like the "motivation-as-introduction" section as it does not present a wide scope within which the paper lies.

input similarity, neural network perspective, validation, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.43)

Reviews: Input Similarity from the Neural Network Perspective

Neural Information Processing Systems | Jan-27-2025, 00:11:07 GMT

All of the reviewers found the proposed technique original and the theory interesting. The reviewers initially had concerns regarding the structure of the paper, the relevance of some of the experiments, and the comparison with perceptual loss. These concerns were alleviated by the author feedback. Assuming that the authors will integrate the author feedback into the paper and incorporate all of the reviewers' feedback, I recommend acceptance as a poster.

author feedback, input similarity, neural network perspective, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.40)

Input Similarity from the Neural Network Perspective

Neural Information Processing Systems | Oct-10-2024, 21:59:56 GMT

Given a trained neural network, we aim at understanding how similar it considers any two samples. For this, we express a proper definition of similarity from the neural network perspective (i.e. ...). We study the mathematical properties of this similarity measure, and show how to estimate sample density with it, in low complexity, enabling new types of statistical analysis for neural networks. We also propose to use it during training, to enforce that examples known to be similar should also be seen as similar by the network. We then study the self-denoising phenomenon encountered in regression tasks when training neural networks on datasets with noisy labels.

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Unnatural language processing: How do language models handle machine-generated prompts?

Kervadec, Corentin, Franzon, Francesca, Baroni, Marco

arXiv.org Artificial Intelligence | Oct-24-2023

Language model prompt optimization research has shown that semantically and grammatically well-formed manually crafted prompts are routinely outperformed by automatically generated token sequences with no apparent meaning or syntactic structure, including sequences of vectors from a model's embedding space. We use machine-generated prompts to probe how models respond to input that is not composed of natural language expressions. We study the behavior of models of different sizes in multiple semantic tasks in response to both continuous and discrete machine-generated prompts, and compare it to the behavior in response to human-generated natural-language prompts. Even when producing a similar output, machine-generated and human prompts trigger different response patterns through the network processing pathways, including different perplexities, different attention and output entropy distributions, and different unit activation profiles. We provide preliminary insight into the nature of the units activated by different prompt types, suggesting that only natural language prompts recruit a genuinely linguistic circuit.

machine-generated prompt, prompt type, template, (16 more...)

arXiv.org Artificial Intelligence

2310.15829

Country:

North America > Dominican Republic (0.04)
North America > United States > Washington > King County > Seattle (0.04)
Asia > China > Hong Kong (0.04)
(8 more...)

Genre: Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

On UMAP's true loss function

Damrich, Sebastian, Hamprecht, Fred A.

arXiv.org Machine Learning | Apr-22-2021

UMAP has supplanted t-SNE as state-of-the-art for visualizing high-dimensional datasets in many disciplines, but the reason for its success is not well understood. In this work, we investigate UMAP's sampling-based optimization scheme in detail. We derive UMAP's effective loss function in closed form and find that it differs from the published one. As a consequence, we show that UMAP does not aim to reproduce its theoretically motivated high-dimensional UMAP similarities. Instead, it tries to reproduce similarities that only encode the shared $k$-nearest-neighbor graph, thereby challenging the previous understanding of UMAP's effectiveness. We claim that the key to UMAP's success is rather its implicit balancing of attraction and repulsion resulting from negative sampling. This balancing in turn facilitates optimization via gradient descent. We corroborate our theoretical findings on toy and single-cell RNA sequencing data.

high-dimensional similarity, similarity, umap, (15 more...)

arXiv.org Machine Learning

2103.14608

Country: